Semantic Outlier Analysis for Sessionizing Web Logs

Author

  • Jason J. Jung
Abstract

As web usage patterns from clients become more complex, simple sessionization based on time- and navigation-oriented heuristics has limited ability to support the various kinds of rule-discovery methods. In this paper, we present semantic session reconstruction based on semantic outliers in web log data. First, a web directory service such as Yahoo is used to enrich web logs with semantics by categorizing each log entry into all possible hierarchical paths. To detect the candidate set of session identifiers, semantic factors such as the semantic mean, deviation, and distance matrix are established. Each semantic session is then obtained through nested repetition of a top-down partitioning and evaluation process. For the experiment, we applied these ontology-oriented heuristics to sessionize one week of access log files from IRCache. Compared with time-oriented heuristics, semantic outlier analysis detected more than 48% additional sessions. This means that we can conceptually track the behavior of users who tend to change their intentions and interests easily, or who simultaneously search for various kinds of information on the web.
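The abstract does not give the definitions of the semantic factors, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes each log entry has already been mapped to a single directory path such as `Computers/Internet/WWW`, uses a tree-edge count as the semantic distance, and treats a transition as an outlier when its distance exceeds the mean plus `k` standard deviations. The names `path_distance`, `distance_matrix`, `partition`, and the parameter `k` are all hypothetical.

```python
from statistics import mean, pstdev


def path_distance(a, b):
    """Assumed distance: edges from each path up to their deepest common ancestor."""
    pa, pb = a.split("/"), b.split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def distance_matrix(paths):
    """Pairwise distances between all category paths in a request sequence."""
    return [[path_distance(p, q) for q in paths] for p in paths]


def partition(paths, k=1.0):
    """Top-down split: cut at the largest outlying transition, then recurse."""
    if len(paths) < 2:
        return [list(paths)]
    # Distances between consecutive requests.
    steps = [path_distance(paths[i], paths[i + 1]) for i in range(len(paths) - 1)]
    mu, sigma = mean(steps), pstdev(steps)
    worst = max(range(len(steps)), key=steps.__getitem__)
    # Evaluation step: stop when no transition stands out from the rest.
    if sigma == 0 or steps[worst] <= mu + k * sigma:
        return [list(paths)]
    return partition(paths[: worst + 1], k) + partition(paths[worst + 1 :], k)


requests = [
    "Computers/Internet/WWW",
    "Computers/Internet/Email",
    "Computers/Internet/WWW",
    "Recreation/Travel/Asia",
    "Recreation/Travel/Europe",
]
sessions = partition(requests)
print(len(sessions))  # 2: the jump from Computers to Recreation is the only outlier
```

In this toy sequence the consecutive distances are 2, 2, 6, 2, so only the third transition exceeds the threshold and the requests split into two sessions even if they were issued within a single time window. A real implementation would also need to handle entries that map to several directory paths, which the abstract mentions but this sketch ignores.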

Similar resources

On Creating Adaptive Web Servers Using Weblog Mining

Personalization of content returned from a web site is an important problem in general, and affects e-commerce and e-services in particular. Targeting appropriate information or products to the end user can significantly change (for the better) the user's experience on a web site. One possible approach to web personalization is to mine typical user profiles from the vast amount of historical dat...

Measuring the Accuracy of Sessionizers for Web Usage Analysis

Companies with web presence rely on web usage analysis to obtain insights on customer behavior, associations among products, impact of advertisement banners, web marketing campaigns and product promotions. The validity of these results depends heavily on the accurate reconstruction of the visitors' activities in the web site. To this end, many sites employ cookies that distinguish among differen...

Enabling Semantic Analysis of User Browsing Patterns in the Web of Data

A useful step towards better interpretation and analysis of the usage patterns is to formalize the semantics of the resources that users are accessing in the Web. We focus on this problem and present an approach for the semantic formalization of usage logs, which lays the basis for effective techniques of querying expressive usage patterns. We also present a query answering approach, which is u...

A Comparison of the Weblogs of Iranian Libraries and Librarians with Top Librarianship Weblogs; 1385

Introduction: Weblogs are useful tools for librarians. There are three main ways of applying weblogs in librarianship: personal use by librarians to update their own knowledge, as a source of information for libraries, and for library services. The aim of this research is to compare Iranian libraries and librarians with superior librarianshi...

(Semantic web) evolution through change logs: Problems and solutions

Knowledge evolution is currently a hot research topic within the Semantic Web community. This paper investigates a change-based Semantic Web, according to which any modification applied to an ontology should be logged. The merits of this approach for supporting the process of evolution are discussed. Subsequently, a number of basic problems concerning the management of such change logs are intr...

Publication date: 2003